Abstract

To include parameter uncertainty in probabilistic climate forecasts one must first specify a prior.
We advocate the use of objective priors, and, in particular, the Jeffreys' Prior. In previous work we
have derived expressions for the Jeffreys' Prior for the case in which the observations are independent
and normally distributed. These expressions make the calculation of the prior much simpler than
evaluation directly from the definition. In this paper, we relax the independence assumption and
derive expressions for the Jeffreys' Prior for the case in which the observations are distributed with
a multivariate normal distribution with constant covariances. Again, these expressions simplify the
calculation of the prior: in this case they reduce it to the calculation of the differences between the
ensemble means of climate model ensembles based on different parameter settings. These calculations
are simple enough to be applied to even the most complex climate models.

1 Introduction



Predicting the future climate is very difficult: this can be seen in the wide spread of predictions and
projections that come from different climate prediction models (IPCC, 2007). This wide spread is
indicative of forecast uncertainty, and for those who might use climate predictions, it is important that
this uncertainty is quantified as well as possible. Only then can decisions be made as to whether certain
climate predictions should be used or ignored, and, if they are to be used, how they should be used and
how much weight they should be given.

One major driver of climate prediction uncertainty is parameter uncertainty: that is, different
parameter settings in climate models lead to different predictions, and no single set of parameters
is correct. In statistical climate predictions, parameter uncertainty can be estimated and incorporated
rather easily using standard statistical techniques. For instance, Jewson (2008) discusses a Bayesian
method for putting parameter uncertainty into predictions from flat-line and linear trend climate
prediction models. In numerical model climate predictions, however, parameter uncertainty is rather more
difficult to estimate and incorporate into predictions since the models are more complex and the number
of parameters is much larger. Some attempts to deal with numerical model parameter uncertainty using
classical statistics are described in Allen et al. (2009), and some that use subjective Bayesian statistics
are described in Tomassini et al. (2007). We, however, are particularly interested in the idea of using a
third statistical paradigm known as objective Bayesian statistics to capture the parameter uncertainty
in numerical climate models. Objective Bayesian statistics is Bayesian statistics in which those priors
which cannot be determined from previous independent studies are determined using a rule, rather than
subjectively using intuition. We have described how this approach can be applied to climate models in
Jewson et al. (2009). We proposed using the Jeffreys' Prior (Jeffreys, 1946), which is the most standard
of the various rules available, and we then showed that the implementation of such an approach can
be simplified by using a parametric form for the predictive distribution from the climate model. This
simplifies the calculation of the prior, because the two steps of differentiating the probabilities from the
climate model and taking the expectation over all possible realisations are replaced by a single step of
differentiating the parameters of the fitted distribution.

One of the shortcomings of the method described in Jewson et al. (2009) is that we assumed that the
observables (or predicted variables) were independent. This simplifies matters because it means that
the predictive distribution factorises into a product of predictive distributions for each variable. Since
we assumed a multivariate normal predictive distribution, that factorised into a product of independent
normal distributions, and the evaluation of the prior became a matter of differentiating the means and
the variances of those normal distributions.

In this paper, we keep the assumption that the observations are multivariate normally distributed.
However, we now relax the assumption that the observations are independent, and replace it with the
assumption that the covariance matrix between the observations is constant as a function of the climate
model parameters. This allows us to derive another set of relatively simple expressions for the prior, which
could readily be evaluated using a suitably designed set of integrations of a numerical model.

In section 2 we give a brief overview of objective priors and the work in Jewson et al. (2009). In
section 3 we then derive expressions for the new case we consider in this paper, for the constant covariance
multivariate normal. In section 4 we summarise.

2 The Use of Objective Priors in Climate Modelling 

The starting point for Bayesian methods for making probabilistic forecasts that include parameter uncer- 
tainty is the following equation from probability theory, sometimes known as the law of total probability: 

$$p(y|x) = \int p(y|\theta)\,p(\theta|x)\,d\theta \tag{1}$$

This equation says that the probability of future event $y$, given the past data $x$, is given by an average
over probabilistic predictions made with all possible parameter values, where the prediction for $y$ from a
model based on the parameter value $\theta$ is written as $p(y|\theta)$. The average is a weighted average, where the
weights on each prediction are given by the probabilities $p(\theta|x)$ of each parameter value given the past data.

Thus far, there is nothing Bayesian about this equation, since Bayes' Theorem has not yet been used. In
Bayesian methods the term $p(\theta|x)$ is evaluated by factorising it using Bayes' Theorem:

$$p(\theta|x) \propto p(x|\theta)\,p(\theta) \tag{2}$$

The task of evaluating $p(\theta|x)$ now becomes a question of evaluating $p(x|\theta)$, which is the probability of
the past events $x$ given each parameter value $\theta$, and of evaluating the prior $p(\theta)$.
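
As a minimal sketch of how equations (1) and (2) fit together, the following Python evaluates the posterior weights on a grid of parameter values and then forms the predictive density as a weighted sum. The toy model is an assumption made purely for illustration and is not taken from this paper: each observation is taken to be independent and normally distributed with mean theta and unit variance, the prior is flat, and the data values and function names are hypothetical.

```python
import math

def normal_logpdf(x, mean, sd):
    """Log density of a normal distribution."""
    return -0.5 * math.log(2 * math.pi * sd ** 2) - 0.5 * ((x - mean) / sd) ** 2

def grid_posterior(x_obs, thetas, prior):
    """Posterior weights p(theta|x) on a grid, via Bayes' Theorem (equation 2).

    Toy assumption: each observation is independent N(theta, 1).
    """
    log_post = []
    for theta, p in zip(thetas, prior):
        log_lik = sum(normal_logpdf(x, theta, 1.0) for x in x_obs)
        log_post.append(log_lik + math.log(p))
    shift = max(log_post)  # subtract the maximum to guard against underflow
    unnorm = [math.exp(lp - shift) for lp in log_post]
    total = sum(unnorm)
    return [u / total for u in unnorm]

def predictive_density(y, thetas, weights):
    """Equation (1) approximated as a weighted sum over the parameter grid."""
    return sum(w * math.exp(normal_logpdf(y, t, 1.0))
               for t, w in zip(thetas, weights))

thetas = [i / 100 for i in range(-300, 301)]  # hypothetical parameter grid
prior = [1.0] * len(thetas)                   # flat prior, for illustration only
x_obs = [0.8, 1.1, 1.4]                       # made-up past data
weights = grid_posterior(x_obs, thetas, prior)
theta_mode = thetas[weights.index(max(weights))]
```

In an objective Bayesian calculation the flat `prior` list would be replaced by values of the prior obtained from a rule, evaluated at each grid point.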

Evaluating $p(x|\theta)$ is, at least conceptually, straightforward since it involves comparing probabilistic
predictions from a climate model with past observations. The distinction between subjective and objective
Bayesian methods appears in how the term $p(\theta)$ is evaluated. In subjective methods $p(\theta)$ is based on
intuition, while in objective methods it is based on a rule. The idea of using a rule is to make the resulting
predictions less arbitrary: this is discussed at greater length in Jewson et al. (2009).
The most widely discussed rule is the Jeffreys' Prior, given by:



$$p(\theta) = \sqrt{\det\left(-E\left[\frac{\partial^2 \log p(x|\theta)}{\partial\theta_j\,\partial\theta_k}\right]\right)} \tag{3}$$



If this equation is applied to all parameters, then there is a standard case in which the results are
questionable. This problem is widely discussed in textbooks on Bayesian statistics (for instance, see
page 90 in Lee (1997)). The problem was actually resolved, however, by Jeffreys himself (see page 1345
of Kass and Wasserman (1996) for an explanation), using the separate treatment of location parameters.



In Jewson et al. (2009) we discussed how Jeffreys' Prior might be evaluated for a climate model. The
steps in equation (3) in which the log of the probability is first differentiated and then integrated (to
evaluate the expectation) are somewhat daunting, and likely to be computationally intensive. However,
they can be simplified if we make the assumption that the output from the model is independent and
normally distributed. The steps of differentiation and integration can then be performed analytically,
and the expression above becomes:



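$$p(\theta) = \sqrt{\det\left(\sum_{i=1}^{n}\left[\frac{1}{\sigma_i^2}\frac{\partial\mu_i}{\partial\theta_j}\frac{\partial\mu_i}{\partial\theta_k} + \frac{2}{\sigma_i^2}\frac{\partial\sigma_i}{\partial\theta_j}\frac{\partial\sigma_i}{\partial\theta_k}\right]\right)} \tag{4}$$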



where $n$ is the number of observations used to validate the model, and $\mu_i$ and $\sigma_i^2$ are the means and
variances of initial condition ensembles. To evaluate this new expression, the means and variances from
initial condition ensembles need to be differentiated with respect to the underlying parameters of the
climate model $\theta$. One can imagine doing this by running multiple initial condition ensembles (although
there may be more efficient methods). This is still not a trivial exercise, but is simpler than
differentiating the probabilities from the model and integrating over all possible realisations. Also, with careful
experimental design it should be possible to evaluate this expression by re-using the model integrations
that are needed to calculate the likelihood term $p(x|\theta)$.

In the case that the ensemble variance does not vary as a function of the parameters, expression (4) reduces
to:



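$$p(\theta) = \sqrt{\det\left(\sum_{i=1}^{n}\frac{1}{\sigma_i^2}\frac{\partial\mu_i}{\partial\theta_j}\frac{\partial\mu_i}{\partial\theta_k}\right)} \tag{5}$$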



3 Jeffreys' Prior for Correlated Observations with Constant Variance

We now consider a slightly different approximation in order to allow for correlations between observations.
We still assume that the observations come from a multivariate normal distribution, as before. However,
instead of assuming that the observations are independent, but that the ensemble variance can vary as
a function of the parameters, we now make a complementary set of approximations in which we assume
that the observations are correlated, but that the covariance matrix is constant as a function of the
parameters.

In general, probability densities from the multivariate normal distribution are given by:

$$p(x|\theta) = \frac{1}{(2\pi)^{n/2} D^{1/2}} \exp\left(-\frac{1}{2}(x-\mu)^T S (x-\mu)\right) \tag{6}$$

where

• $x$ is a vector for the observables, or predicted variables, produced by the model

• $\theta$ is a vector for the parameters in the model

• $\mu = \mu(\theta)$ is a vector of the mean response of the model for each observable (i.e. the ensemble mean
of $x$ for an infinite-sized initial condition ensemble for fixed parameters $\theta$)

• $\Sigma = \Sigma(\theta)$ is the covariance matrix of the response of the model between observables (i.e. the
ensemble covariance matrix of $x$ for an infinite-sized initial condition ensemble for fixed parameters
$\theta$)

• $S = \Sigma^{-1} = S(\theta)$ is the inverse of the covariance matrix

• and $D = \det(\Sigma) = D(\theta)$ is the determinant of the covariance matrix

This gives:

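\begin{align}
\ln p(x|\theta) &= -\frac{n}{2}\ln 2\pi - \frac{1}{2}\ln D - \frac{1}{2}(x-\mu)^T S (x-\mu) \tag{7} \\
&= -\frac{n}{2}\ln 2\pi - \frac{1}{2}\ln D - \frac{1}{2}\left(x^T S x - x^T S \mu - \mu^T S x + \mu^T S \mu\right) \tag{8}
\end{align}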

Since $S$ is symmetric, $x^T S \mu = \mu^T S x$ (see appendix 1), which means that the above expression for $\ln p(x|\theta)$
simplifies a little to:

$$\ln p(x|\theta) = -\frac{n}{2}\ln 2\pi - \frac{1}{2}\ln D - \frac{1}{2}\left(x^T S x - 2x^T S \mu + \mu^T S \mu\right) \tag{9}$$
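
The equivalence of the expanded form (9) and the compact quadratic form (7) can be checked numerically. The sketch below does this for a made-up two-observable example; the covariance matrix, the vectors and the function name are all hypothetical choices for illustration, with S and D entered by hand for that covariance matrix.

```python
import math

def log_density(x, mu, S, D):
    """Equation (9): log of the multivariate normal density.

    S is the inverse covariance matrix and D the determinant of the
    covariance matrix.
    """
    n = len(x)

    def quad(a, b):
        # Bilinear form a^T S b.
        return sum(a[i] * S[i][j] * b[j] for i in range(n) for j in range(n))

    return (-0.5 * n * math.log(2 * math.pi) - 0.5 * math.log(D)
            - 0.5 * (quad(x, x) - 2.0 * quad(x, mu) + quad(mu, mu)))

# Made-up example: covariance [[1, 0.5], [0.5, 1]], so D = 0.75.
D = 0.75
S = [[1.0 / 0.75, -0.5 / 0.75],
     [-0.5 / 0.75, 1.0 / 0.75]]
x, mu = [0.3, -0.2], [0.1, 0.4]

expanded = log_density(x, mu, S, D)

# Equation (7), using the quadratic form in (x - mu) directly.
diff = [xi - mi for xi, mi in zip(x, mu)]
direct = (-math.log(2 * math.pi) - 0.5 * math.log(D)
          - 0.5 * sum(diff[i] * S[i][j] * diff[j]
                      for i in range(2) for j in range(2)))
```

The two values agree to rounding error because S is symmetric, which is the property used to merge the two cross terms.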

We now consider two cases. 

3.1 Single parameter, multiple correlated observations 

The first case we consider, as a warm-up, is where there is just a single parameter in the climate model.
If we consider $\theta$ to be this single (scalar) parameter, then:

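$$\frac{\partial \ln p(x|\theta)}{\partial\theta} = -\frac{1}{2}\frac{\partial \ln D}{\partial\theta} - \frac{1}{2}x^T\frac{\partial S}{\partial\theta}x + x^T\frac{\partial(S\mu)}{\partial\theta} - \frac{1}{2}\frac{\partial(\mu^T S \mu)}{\partial\theta} \tag{10}$$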



We now make the further approximation that the covariance matrix $\Sigma$ is constant, which simplifies the
subsequent algebra considerably. If $\Sigma$ is constant, then both $D$ and $S$ are also constant, and so:



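\begin{align}
\frac{\partial \ln p(x|\theta)}{\partial\theta} &= x^T S \frac{\partial\mu}{\partial\theta} - \frac{1}{2}\frac{\partial(\mu^T S \mu)}{\partial\theta} \tag{11} \\
&= x^T S \frac{\partial\mu}{\partial\theta} - \frac{1}{2}\left(\frac{\partial\mu^T}{\partial\theta} S \mu + \mu^T S \frac{\partial\mu}{\partial\theta}\right) \tag{12}
\end{align}

Again, since $S$ is symmetric, $\mu^T S \frac{\partial\mu}{\partial\theta} = \frac{\partial\mu^T}{\partial\theta} S \mu$, which simplifies the above expression to

$$\frac{\partial \ln p(x|\theta)}{\partial\theta} = x^T S \frac{\partial\mu}{\partial\theta} - \mu^T S \frac{\partial\mu}{\partial\theta} \tag{13}$$

Taking another derivative with respect to $\theta$ gives:

$$\frac{\partial^2 \ln p(x|\theta)}{\partial\theta^2} = x^T S \frac{\partial^2\mu}{\partial\theta^2} - \mu^T S \frac{\partial^2\mu}{\partial\theta^2} - \frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta} \tag{14}$$

Taking expectations over $x$ (and noting that $E(x) = \mu$) gives:

\begin{align}
E\left(\frac{\partial^2 \ln p(x|\theta)}{\partial\theta^2}\right) &= E\left(x^T S \frac{\partial^2\mu}{\partial\theta^2}\right) - E\left(\mu^T S \frac{\partial^2\mu}{\partial\theta^2}\right) - E\left(\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}\right) \tag{15} \\
&= E(x^T)\, S \frac{\partial^2\mu}{\partial\theta^2} - \mu^T S \frac{\partial^2\mu}{\partial\theta^2} - \frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta} \tag{16} \\
&= \mu^T S \frac{\partial^2\mu}{\partial\theta^2} - \mu^T S \frac{\partial^2\mu}{\partial\theta^2} - \frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta} \tag{17} \\
&= -\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta} \tag{18}
\end{align}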

That the second derivatives cancel and disappear from this expression is no surprise: we know from a
standard result that the Jeffreys' Prior can only contain first derivatives (see lemma 2, page 87, Lee
(1997)).

The Jeffreys' Prior is then given by:

$$p(\theta) = \sqrt{\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}} \tag{19}$$



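If there are $n$ observations then:

• $\mu$ is an $n \times 1$ vector

• $\frac{\partial\mu}{\partial\theta}$ is an $n \times 1$ vector

• $\mu^T$ is a $1 \times n$ vector

• $\frac{\partial\mu^T}{\partial\theta}$ is a $1 \times n$ vector

• $S$ is an $n \times n$ matrix

• $\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}$ is a scalar

• and $p(\theta)$ is a scalar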
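
The single-parameter expression (19) can be sketched numerically as follows. This is an illustration under stated assumptions rather than a recipe from the paper: `ensemble_mean` is a hypothetical stand-in for running an initial condition ensemble and averaging it, the derivative is approximated by a central finite difference between two parameter settings, and the inverse covariance matrix `S` is simply made up.

```python
import math

def ensemble_mean(theta):
    """Hypothetical stand-in for a climate model's ensemble mean response.

    Returns the mean of n = 2 observables for a scalar parameter theta.
    """
    return [2.0 * theta, theta ** 2]

def mean_derivative(mean_fn, theta, h=1e-4):
    """Central finite-difference estimate of d(mu)/d(theta)."""
    up, down = mean_fn(theta + h), mean_fn(theta - h)
    return [(u - d) / (2.0 * h) for u, d in zip(up, down)]

def jeffreys_prior_1d(mean_fn, theta, S, h=1e-4):
    """Unnormalised prior sqrt(dmu^T S dmu) for a constant inverse covariance S."""
    dmu = mean_derivative(mean_fn, theta, h)
    n = len(dmu)
    quad = sum(dmu[i] * S[i][j] * dmu[j] for i in range(n) for j in range(n))
    return math.sqrt(quad)

# Assumed inverse covariance matrix for two correlated observables.
S = [[2.0, -1.0],
     [-1.0, 2.0]]
prior_at_1 = jeffreys_prior_1d(ensemble_mean, 1.0, S)
```

For a real model the ensemble means would be noisy estimates from finite ensembles, so the step `h` would need to be large enough for the difference in means to stand out from sampling noise.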

In the case in which the observations are independent, $\Sigma$ reduces to a diagonal matrix with the variances
of each of the $n$ observations $\sigma_1^2, \sigma_2^2, \ldots, \sigma_n^2$ on the diagonal, and $S$ is a diagonal matrix with the inverse
variances on the diagonal. The matrix multiplication in the above expression then reduces to a simple
sum over the observations:



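\begin{align}
p(\theta) &= \sqrt{\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}} \tag{20} \\
&= \sqrt{\sum_{i=1}^{n}\frac{\partial\mu_i}{\partial\theta}\frac{1}{\sigma_i^2}\frac{\partial\mu_i}{\partial\theta}} \tag{21} \\
&= \sqrt{\sum_{i=1}^{n}\frac{1}{\sigma_i^2}\left(\frac{\partial\mu_i}{\partial\theta}\right)^2} \tag{22}
\end{align}

which agrees with the uncorrelated observations case given by equation 20 in Jewson et al. (2009).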
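
The reduction from the matrix expression to a sum over observations can be confirmed with a short numerical check. The derivative values and variances below are made up, and the two function names are hypothetical; the point is only that a diagonal inverse covariance matrix gives the same answer as the explicit sum.

```python
import math

def prior_full(dmu, S):
    """sqrt(dmu^T S dmu) with a full inverse covariance matrix S."""
    n = len(dmu)
    return math.sqrt(sum(dmu[i] * S[i][j] * dmu[j]
                         for i in range(n) for j in range(n)))

def prior_independent(dmu, variances):
    """sqrt(sum_i (dmu_i)^2 / sigma_i^2) for independent observations."""
    return math.sqrt(sum(d * d / v for d, v in zip(dmu, variances)))

dmu = [0.5, -1.2, 2.0]        # made-up derivatives of the ensemble means
variances = [1.0, 4.0, 0.25]  # made-up ensemble variances

# Diagonal inverse covariance matrix with 1 / sigma_i^2 on the diagonal.
S_diag = [[1.0 / variances[i] if i == j else 0.0 for j in range(3)]
          for i in range(3)]

full = prior_full(dmu, S_diag)
indep = prior_independent(dmu, variances)
```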



3.2 Multiple parameter, multiple observations 

We now consider the more general case with multiple parameters. As a first step we consider two
parameters. Taking the derivative of $\frac{\partial \ln p(x|\theta)}{\partial\theta}$, from equation (13) above, with respect to a second parameter $\phi$ gives:

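$$\frac{\partial^2 \ln p(x|\theta)}{\partial\theta\,\partial\phi} = x^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \mu^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \frac{\partial\mu^T}{\partial\phi} S \frac{\partial\mu}{\partial\theta} \tag{23}$$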

Taking expectations gives: 

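\begin{align}
E\left(\frac{\partial^2 \ln p(x|\theta)}{\partial\theta\,\partial\phi}\right) &= E\left(x^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi}\right) - E\left(\mu^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi}\right) - E\left(\frac{\partial\mu^T}{\partial\phi} S \frac{\partial\mu}{\partial\theta}\right) \tag{24} \\
&= E(x^T)\, S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \mu^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \frac{\partial\mu^T}{\partial\phi} S \frac{\partial\mu}{\partial\theta} \tag{25} \\
&= \mu^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \mu^T S \frac{\partial^2\mu}{\partial\theta\,\partial\phi} - \frac{\partial\mu^T}{\partial\phi} S \frac{\partial\mu}{\partial\theta} \tag{26} \\
&= -\frac{\partial\mu^T}{\partial\phi} S \frac{\partial\mu}{\partial\theta} \tag{27}
\end{align}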

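Generalising from two to multiple parameters, and writing the vector of parameters as $\theta$, the Jeffreys'
Prior is then:

$$p(\theta) = \sqrt{\det\left(\frac{\partial\mu^T}{\partial\theta_j} S \frac{\partial\mu}{\partial\theta_k}\right)} \tag{28}$$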



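If there are $n$ observations and $m$ parameters then:

• $\mu$ is an $n \times 1$ vector

• $\frac{\partial\mu}{\partial\theta}$ is an $n \times m$ matrix

• $\mu^T$ is a $1 \times n$ vector

• $\frac{\partial\mu^T}{\partial\theta}$ is an $m \times n$ matrix

• $S$ is an $n \times n$ matrix

• $\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}$ is an $m \times m$ matrix

• $\det\left(\frac{\partial\mu^T}{\partial\theta} S \frac{\partial\mu}{\partial\theta}\right)$ is a scalar

• and $p(\theta)$ is a scalar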
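
For several parameters, expression (28) amounts to building the n x m matrix of derivatives of the ensemble means, forming the m x m matrix product with S, and taking the square root of its determinant. The following is a minimal sketch under assumptions that are not part of the paper: `toy_mean` is a hypothetical linear response standing in for ensemble means from model runs, derivatives are central finite differences, and `S` is an assumed inverse covariance matrix.

```python
import math

def jacobian(mean_fn, theta, h=1e-4):
    """n x m matrix of central finite differences d(mu_i)/d(theta_k)."""
    cols = []
    for k in range(len(theta)):
        up = list(theta)
        up[k] += h
        down = list(theta)
        down[k] -= h
        mu_up, mu_down = mean_fn(up), mean_fn(down)
        cols.append([(u - d) / (2.0 * h) for u, d in zip(mu_up, mu_down)])
    # cols[k][i] holds d(mu_i)/d(theta_k); transpose to n x m.
    return [list(row) for row in zip(*cols)]

def determinant(a):
    """Determinant by Gaussian elimination with partial pivoting."""
    a = [row[:] for row in a]
    n, det = len(a), 1.0
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(a[r][c]))
        if a[p][c] == 0.0:
            return 0.0
        if p != c:
            a[c], a[p] = a[p], a[c]
            det = -det
        det *= a[c][c]
        for r in range(c + 1, n):
            f = a[r][c] / a[c][c]
            for k in range(c, n):
                a[r][k] -= f * a[c][k]
    return det

def jeffreys_prior(mean_fn, theta, S, h=1e-4):
    """Unnormalised prior sqrt(det(J^T S J)), J being the n x m Jacobian."""
    J = jacobian(mean_fn, theta, h)
    n, m = len(J), len(theta)
    info = [[sum(J[i][j] * S[i][l] * J[l][k]
                 for i in range(n) for l in range(n))
             for k in range(m)] for j in range(m)]
    return math.sqrt(determinant(info))

def toy_mean(theta):
    """Hypothetical linear ensemble-mean response: 3 observables, 2 parameters."""
    a, b = theta
    return [a + b, a - b, 2.0 * a]

# Assumed inverse covariance matrix; the first two observables are correlated.
S = [[2.0, -1.0, 0.0],
     [-1.0, 2.0, 0.0],
     [0.0, 0.0, 1.0]]
prior_value = jeffreys_prior(toy_mean, [0.3, 0.7], S)
```

Each column of the Jacobian costs two ensemble means in this scheme, so m parameters need 2m parameter settings; one-sided differences around a shared control ensemble would need m + 1, at the price of lower accuracy.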

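In the independent observations case this reduces to

$$p(\theta) = \sqrt{\det\left(\sum_{i=1}^{n}\frac{1}{\sigma_i^2}\frac{\partial\mu_i}{\partial\theta_j}\frac{\partial\mu_i}{\partial\theta_k}\right)} \tag{29}$$

which agrees with the uncorrelated observations case given by equation 37 in Jewson et al. (2009), and
is also given above as equation (5).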

4 Summary and Discussion 

Bayesian statistics offers the only effective framework for including parameter uncertainty in probabilistic
forecasting. Part of the Bayesian approach involves specifying a prior distribution for the parameters,
and there are two options for how to do this: take a subjective approach, where the prior is based on
intuition, or take an objective approach, where the prior is based on a rule.



Of the various rules that might be used in the objective approach, one in particular is the most widely
discussed, and is the closest to being an accepted standard: the Jeffreys' Prior. When we consider how to
apply Jeffreys' Prior in climate modelling, we find that the probabilities from initial condition ensemble
runs of a climate model need to be differentiated with respect to the parameters of the model, and
integrated over all possible model states. This could be done, but is cumbersome. However, by making
parametric assumptions for the shape of the predictive distributions there is the potential for this to be
simplified. The most obvious parametric assumption is that all output from the model is distributed
according to the multivariate normal. We have now considered two special cases of this. In Jewson et al.
(2009) we considered the case where the observations were considered to be independent, but where both
the means and the variances of initial condition ensembles could vary as a function of the parameters.
In this paper we have considered the case where the observations are not independent, but where the
covariance matrix between observations is constant. Clearly, neither of these two cases is a special case
of the other.

We have derived expressions for the Jeffreys' Prior for the constant covariance case. These expressions
show that evaluating the Jeffreys' Prior reduces to evaluating first derivatives of the ensemble mean with
respect to the various parameters in the model, and performing a simple calculation using those first
derivatives. This calculation involves the covariance matrix between different observations, which can be
estimated from the grand ensemble of all model runs with all parameter values, since we are assuming
that the covariance matrix is constant.
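
One possible way to form such an estimate is sketched below. The pooled within-ensemble estimator used here is an assumption for illustration, not a method specified in this paper: each ensemble's own mean is removed before pooling, so that differences in the mean between parameter settings do not inflate the estimate. The function name and the data are hypothetical.

```python
def pooled_covariance(ensembles):
    """Pool anomalies from several ensembles into one covariance estimate.

    ensembles: list of ensembles, one per parameter setting; each ensemble is
    a list of members, and each member is a list of n observables.
    """
    n = len(ensembles[0][0])
    total = [[0.0] * n for _ in range(n)]
    dof = 0
    for members in ensembles:
        size = len(members)
        mean = [sum(m[i] for m in members) / size for i in range(n)]
        for m in members:
            anom = [m[i] - mean[i] for i in range(n)]
            for i in range(n):
                for j in range(n):
                    total[i][j] += anom[i] * anom[j]
        dof += size - 1  # one degree of freedom lost per ensemble mean
    return [[total[i][j] / dof for j in range(n)] for i in range(n)]

# Two made-up ensembles with different means but the same spread.
ens_a = [[0.0, 0.0], [2.0, 1.0], [4.0, -1.0]]
ens_b = [[10.0, 5.0], [12.0, 6.0], [14.0, 4.0]]
cov = pooled_covariance([ens_a, ens_b])
```

The matrix S that appears in the expressions for the prior would then be the inverse of this estimate, which requires the pooled sample to be large enough for the estimate to be non-singular.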

Overall we have presented a method for making objective probabilistic forecasts which is sufficiently simple 
that it could be applied to real climate models, even when the observations are considered correlated. 
Our next challenge is to attempt to tackle the general multivariate normal case for correlated observations 
with non-constant covariance matrix, and to consider predictive distributions other than normal. 